With the growing power of computers and the spread of intelligent software, the interpretive and
analytical capabilities of computing systems have grown remarkably, providing intelligent solutions
to almost every computing problem. In this direction, we examine how different geocomputation
techniques have been applied to estimate water body parameters in order to identify the level of
contamination that leads to different levels of eutrophication. The main aim of this paper is to
identify the state of the art in artificial neural network paradigms that are prevalent and effective
in modeling and combining spatial data for prediction. In particular, we identify the analysis
techniques and parameters that are mainly used for quality inspection of lakes and estimation of
their nutrient pollutant content, and the neural network models that have been used to forecast the
level of eutrophication in water bodies. The techniques are analyzed over four main steps:
assimilation of spatial data, statistical interpretation technique, observed parameters used for
eutrophication estimation, and accuracy of the resulting data.

Keywords: Eutrophication, Geocomputing, Spatial imagery data, Water quality
1. INTRODUCTION

Eutrophication is the process by which the nutrient content of water bodies such as lakes,
estuaries, rivers, or slow-moving streams increases, accelerating the excessive growth of plants
such as algae, periphyton, and weeds. This enhanced growth, known as an algal bloom, results in a
low concentration of dissolved oxygen (a state known as hypoxia) and in the decay of weaker plant
species as others are favored, and is likely to cause severe reductions in water quality. Nutrients
reach water bodies from many sources, such as fertilizers applied to agricultural fields, landfills
near rivers, deposition of nitrogen from the atmosphere, soil erosion, and sewage treatment plant
discharges. Eutrophication decreases the resource value of rivers, lakes, and estuaries, adversely
affecting the use of water for irrigation and fishing as well as aquatic plant and animal life, and
thus unbalances the water ecosystem. Severe health problems emerge where eutrophic conditions
interfere with drinking water treatment. Jin [1] showed that all lakes studied were undergoing
eutrophication. In the 1970s, most of the lakes, accounting for 91.8% of the water, were at the
mesotrophic stage. In the following decade the percentage of lakes with oligotrophic status
decreased from approximately 3% to 0.5%, and the percentage with eutrophic status increased from 5%
to 55%. By 2008 about 60% of lakes in China were in a eutrophic or hypertrophic condition, and it
was further predicted that by 2030 all urban lakes would share the same status.


Journal homepage: http://iaesjournal.com/online/index.php/IJAI 


ISSN: 2252-8938


Eutrophication is one of the fastest-growing pollution problems in inland water bodies around the
globe. Restoring water bodies therefore requires intelligent computation techniques that analyze
the current status of the aquatic ecosystem and warn about its future characteristics through time
series forecasting. In today's fast computing world, data analyzers need to be powered by
artificial intelligence techniques. As interpreters, modelers, and predictors, neural networks have
emerged as a capable technique that accurately approximates complicated non-linear input-output
relationships. A neural network is a parallel distributed processor with properties inspired by the
human cognitive system. Artificial neural networks (ANNs) can compute, process, predict, and
classify data, and offer the advantages of nonlinearity, input-output mapping, adaptivity,
generalization, and fault tolerance [2].

In this direction, we discuss computing techniques used to interpret and estimate the
eutrophication status of water bodies, taking spatial and temporal data as input parameters. The
rest of this paper is organized as follows. Section 2 gives a brief account of geocomputation and
its major techniques: pattern recognition, spatial data analysis, and artificial intelligence.
Section 3 surveys related papers and compares the techniques used by researchers to compute spatial
data and generate the desired results. Section 4 concludes with the model or architecture for a
system that supports enhanced inference in eutrophication estimation.


2. TAXONOMY OF GEOCOMPUTATION 

Geocomputation is an emerging field with a wide scope of research that advocates computation-based
approaches such as neural networks, heuristic search, and computational automata design for spatial
data analysis. This interdisciplinary field goes beyond the mere application of statistical
techniques to spatial data by adding an element of cognition. The term "geocomputation" was coined
by Openshaw and Abrahart [3] and expanded by Longley and Brooks [4] to describe the use of
computer-intensive methods for knowledge discovery in geography, especially those that employ
non-conventional data clustering and analysis techniques. It was later elaborated to include
spatial data analysis, dynamic modeling, visualization, and space-time dynamics.

The major factors in the evolution of geocomputation were: computerized data-rich environments and
the capability of computers to record and process big data; affordable computational power, with
the emergence of virtualization and, more recently, cloud computing, which provide high computation
at lower cost; and the research effort devoted to statistical techniques, mining algorithms, and
architectures, which carried spatial data analysis and mining forward. Further progress in this
area calls for more computational research on computer-based pattern search, exploratory spatial
data analysis techniques, artificial intelligence approaches with more powerful heuristic search
algorithms, knowledge processing systems, and dynamic modeling that can capture real-time scenarios
of physical landscapes and other attributes.

We investigated only the major techniques encountered in the analysis of geospatial imagery data
for computing the characteristics of lake water conditions. These are as follows:


2.1. Spatial Data Analysis and Mining Techniques 

Geographical Information Systems (GIS) are large-domain computing systems that facilitate the
capture, storage, retrieval, management, and analysis of spatial data with geographical content.
These systems use different geospatial analysis techniques to derive accurate and meaningful
results from structured imagery data received from imaging satellites.

As defined by Burrough, cited in [5], spatial analysis in GIS involves three main types of
operations: 1) attribute query, also known as non-spatial (or aspatial) query; 2) spatial query;
and 3) generation of new data sets from the original database. Taken together, the process starts
with a simple attribute query on the spatial data, continues with the processing of a spatial
query, and ends with the generation of a new data set from these queries, which serves as an
alternative data source. Before analysis, all spatial data are checked for spatial autocorrelation,
a value-adding step recognized as an essential part of spatial data preprocessing. Measures such as
the correlation coefficient, Moran's index, the join count statistic, Geary's C, the Getis-Ord G
statistic, and the semi-variogram plot have been employed to assess the global association of the
data sets [6].
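
As an illustration of one of the autocorrelation measures listed above, the following minimal
sketch computes the global Moran's index in plain Python. The station readings and the binary
neighbour matrix are hypothetical values chosen for the example and do not come from any of the
cited studies.

```python
def morans_i(values, weights):
    """Global Moran's I for observations `values` and a spatial weight matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]                     # deviations from the mean
    w_sum = sum(sum(row) for row in weights)             # total of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))     # weighted cross-products
    return (n / w_sum) * (cross / sum(d * d for d in dev))

# Hypothetical chlorophyll-a readings at four stations along a shoreline
chl_a = [1.0, 2.0, 3.0, 4.0]
# Binary contiguity (assumption): each station neighbours the previous and the next one
neighbours = [[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]]
print(morans_i(chl_a, neighbours))
```

A positive value indicates that neighbouring stations tend to carry similar readings, while a
value near -1 (as for an alternating pattern such as 1, 4, 1, 4) indicates that neighbours differ.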


2.2. Geovisualization/Computer Based Pattern Recognition 

Geovisualization is one of the most important aspects of geocomputation, as it eases analysis by
converting imagery data into tabular form, to which different statistical analysis techniques can
be applied. It goes beyond two- and three-dimensional mapping to include analysis of the physical
surface and of the connections among different terrains, natural as well as man-made. MacEachren
and Kraak [7] characterized the major aspects of geovisualization and assembled them in a process
represented by a three-dimensional cube with explore, analyze, synthesize, and present as the major
tasks.

The initial step of geovisualization is to explore and analyze geographical data, that is, spatial
data captured by the precise and powerful sensors of various satellites and stored on computing
devices for later reference. It starts with processing the satellite imagery using digital image
processing tools and converting it into tabular form, which can easily be interpreted using
different statistical techniques. The collected spatial data then undergo auto-correction,
interpolation, pruning, and normalization, and are stored in a meaningful format. Synthesis is
about deriving new and more meaningful outcomes from the raw data after applying a series of
statistical techniques and methods. The last step, presentation, represents the information in a
more generalized form known as knowledge. It also covers the visualization of geographical images
on global maps such as Google Maps or other GIS tools, which can support predictions about the
status of different physical aspects of the earth.
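
The conversion of imagery into tabular form followed by normalization can be sketched as follows.
This is only a minimal illustration: the two-by-two "scene", the band names, and the choice of
min-max scaling are assumptions made for the example, not the procedure of any cited system.

```python
def raster_to_table(bands):
    """Flatten a dict of equally sized 2-D band grids into one record per pixel."""
    names = sorted(bands)
    first = bands[names[0]]
    table = []
    for r in range(len(first)):
        for c in range(len(first[0])):
            record = {"row": r, "col": c}
            for name in names:
                record[name] = bands[name][r][c]
            table.append(record)
    return table

def min_max_normalize(table, column):
    """Add a `<column>_norm` field scaled to the range 0..1."""
    lo = min(rec[column] for rec in table)
    hi = max(rec[column] for rec in table)
    span = (hi - lo) or 1.0                  # guard against a constant band
    for rec in table:
        rec[column + "_norm"] = (rec[column] - lo) / span
    return table

# Hypothetical 2 x 2 scene with two spectral bands (digital numbers)
scene = {"green": [[10, 20], [30, 50]], "red": [[5, 5], [15, 25]]}
pixels = min_max_normalize(raster_to_table(scene), "green")
for rec in pixels:
    print(rec)
```

Each record can then be fed to the statistical techniques discussed in this section, with the
row and column indices kept so that results can be mapped back onto the image.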

In applying geostatistical techniques, the commonly used multivariate methods are cluster analysis
(CA), factor analysis/principal component analysis (FA/PCA), and discriminant analysis (DA). These
lead to an effective data management network and monitoring system for spatial data, reducing the
cost, labor, and effort of on-ground sampling while giving a more accurate picture of landscape
variations.

The author of [8] collected water samples from the major sampling stations over a set time span
and carried out an exploratory analysis of the data using box plots, ANOVA, display methods
(principal component analysis), and unsupervised pattern recognition (cluster analysis) to analyze
the sources of variation in water quality. The study demonstrated point source analysis for
pollution identification and nutrient sources such as municipal wastewater, and classified the
river water samples using PCA and cluster analysis.

In the studies by Alberto et al. [9] and Singh et al. [10], both spatial and temporal data for
rivers were evaluated for quality analysis. Different parameters collected from scattered stations
formed a complex data matrix, which was then treated with three techniques: cluster analysis (CA),
which gave good results as a first exploratory method for evaluating both spatial and temporal
differences; factor analysis/principal component analysis (FA/PCA), which helped identify component
groups; and discriminant analysis (DA), which gave the best results for reducing the data
dimensions of large data sets. These studies show the necessity and usefulness of multivariate
statistical techniques for evaluating and interpreting large, complex water quality data sets, with
a view to obtaining better information and designing an effective monitoring and management
network for water resources.
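
To make the idea of cluster analysis on a station-by-parameter data matrix concrete, the sketch
below standardizes each parameter and groups the stations. It uses k-means with a deterministic
start as a simple stand-in; the cited studies do not state that they used k-means, and the station
values (nitrate, phosphate, BOD) are hypothetical.

```python
import math

def standardize(rows):
    """Z-score each column so that parameters with large units do not dominate."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def k_means(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical readings per station: nitrate (mg/L), phosphate (mg/L), BOD (mg/L)
stations = [[0.5, 0.02, 1.0],
            [4.0, 0.30, 6.0],
            [0.6, 0.03, 1.2],
            [3.8, 0.28, 5.5],
            [0.4, 0.02, 0.9]]
groups = k_means(standardize(stations), k=2)
print(groups)
```

Stations that end up in the same group share a similar pollution profile, which is the kind of
first exploratory grouping that the studies above report for CA.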

Geographical Information Systems (GIS) are intended for resource management and are an efficient
decision-making tool. However, despite their sophisticated technology, they are not readily used by
the general public because they lack the means to distribute the analyzed information efficiently.
As a further advance, Caquard et al. [11] proposed a cartographic representation of water quality
mapping information that can deliver customized results according to the audience. Using
interpolation and extrapolation techniques, the visual data are subjected to correlation and
clustering analysis, and a gradation of colors represents the level of water quality from lower to
higher. Pattern recognition using powerful learning techniques makes the geographical view more
realistic to the human eye. Animation techniques, which provide higher resolution and add
dimensions to the perspective views, have further enhanced the geovisualization experience.
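
The combination of interpolation and color gradation can be sketched as below. Inverse distance
weighting (IDW) is used here only as one common interpolation method; the source does not say
which method Caquard et al. applied, and the station coordinates, values, and class breaks are
hypothetical.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (sx, sy, value) samples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value                     # the point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def quality_class(value, breaks=(2.0, 4.0)):
    """Map a value to a shade from lower to higher; the breaks are illustrative only."""
    shades = ("light", "medium", "dark")
    return shades[sum(value >= b for b in breaks)]

# Hypothetical stations: (x, y, pollutant concentration)
stations = [(0.0, 0.0, 1.0), (2.0, 0.0, 5.0)]
mid = idw(stations, 1.0, 0.0)
print(mid, quality_class(mid))
```

Evaluating `idw` on a regular grid and shading each cell with `quality_class` yields a simple
graded map of the kind described above.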


2.3. Machine Learning Techniques 

Artificial intelligence has opened a new horizon of intelligent machines that combine cognitive
ability with large databases for storing and processing information, so as to impart knowledge and
assist in day-to-day working environments. Unlike statistical techniques, which are well suited to
extracting meaning from linear data, machine learning techniques can analyze the non-linear data
sets that actually prevail in real-world representations. In the early phase of geocomputation,
Câmara et al. [12] advocated the use of computationally intensive techniques such as neural
networks, heuristic search, and cellular automata for spatial data analysis.

With the evolution of techniques such as Artificial Neural Networks (ANNs), Support Vector Machines
(SVMs), and cellular automata (mainly for simulating machine behavior), these approaches have
proved important in analyzing spatial and temporal data because of their native ability to model
complex nonlinear data sets. ANNs support both supervised and unsupervised learning for basic tasks
such as classification, clustering, and prediction, the last of which can be derived from
regression analysis of empirical data sets.
Another machine learning approach is the family of kernel methods mentioned by Boser et al. [13],
in which kernel functions map data into a higher-dimensional feature space without explicit
computation of the mapping. Research in geospatial data modeling is moving towards intelligent
software tools developed under the framework of Machine Learning Office. Applications include
topo-climatic modeling; natural hazard assessment, covering landslides caused by heavy rainfall and
avalanches at higher altitudes; pollution mapping, covering air and soil pollution along with
indoor radon and heavy metals; natural resource assessment; remote sensing imagery classification;
socio-economic data analysis; and geovisualization [14].
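
The statement that a kernel works in a higher-dimensional feature space without computing the
mapping can be checked numerically. The sketch below uses a degree-2 polynomial kernel on
two-dimensional inputs, for which the explicit feature map is known; the input vectors are
arbitrary illustrative values, and the RBF kernel is shown only as another commonly used choice.

```python
import math

def poly_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel k(x, y) = (x . y)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def explicit_map(x):
    """Explicit 3-D feature map for a 2-D input whose dot product equals poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

u, v = (1.0, 2.0), (3.0, 0.5)
via_kernel = poly_kernel(u, v)
via_mapping = sum(a * b for a, b in zip(explicit_map(u), explicit_map(v)))
print(via_kernel, via_mapping)
```

Both routes give the same inner product, but the kernel route never builds the mapped vectors,
which is what makes kernel methods practical when the feature space is very large.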

Figure 1 depicts a broad taxonomic breakdown of the major technologies that together cover
geocomputation.


[Figure 1 diagram, three branches: Spatial Data Analysis and Mining Techniques; Machine Learning
Techniques; Geovisualization / Computer Based Pattern Recognition]

Figure 1. Taxonomy of Geocomputation Techniques


3. STATE OF THE ART 

This section discusses the different techniques for evaluating eutrophication, with the main focus
on the traditional methods and methodologies that remain the basic building blocks of every
geocomputing system dealing with similar problems. The first subsection explains the assimilation
of data from eutrophic water bodies and the management of these data for decision making or
restoration. The next subsection presents the state of the art in neural networks for similar water
quality assessment problems, where the primary data are spatial data from different satellite
sensors, with varying metadata and from different types of sources such as lakes, oceans, and seas.


3.1. Data for Eutrophication Management and Control 

Water is the mainstay of life on earth, and along with its scarcity, its quality is now a prominent
issue. We therefore chose the problem of eutrophication of water bodies and set out to identify the
geocomputation techniques used to measure current quality, along with forward-looking
methodologies. As mentioned in the publication (http://www.unep.or.jp) [31], the ways of monitoring
the impacts of eutrophication and establishing management options are: 1) chemical monitoring,
which focuses on measuring total phosphorus content, although nitrogen has more significance in
evaluating nutrient content for eutrophication estimation, and chemical monitoring is more
difficult in a lake or reservoir environment; 2) bio-assessment, which considers the main result of
eutrophication, namely abundant growth in biomass, by measuring chlorophyll-a and the concentration
of particulate organic carbon (POC), but is not well suited to routine monitoring; and 3)
estimation techniques, which include point source and non-point source estimation and are found to
be the most suitable for eutrophication monitoring, since phosphorus data can easily be integrated
with other knowledge on land use, demography, and so on for interpretation. However, their results
may vary because of factors such as spatial effects, auto-correction, classification, and the other
techniques used for analysis, which is largely a matter of the accuracy of the computation
techniques.

Geospatial data from satellites are readily available. Landsat imagery is the most widely used,
with data from three major sensors: the Multi-spectral Scanner (MSS), the Thematic Mapper (TM), and
the Enhanced Thematic Mapper Plus (ETM+). Baruha et al. [15], Sudheer et al. [16], Canziani et al.
[17], and Guan et al. [18] used these imagery data to analyze water quality with different
geocomputation techniques. Data from SPOT satellite images, MODIS imagery, and IRS P6 sensor
imagery were also used to study sediment and nutrient levels in water bodies by Mohamed et al.
[19], Xue et al. [20], and Sheela et al. [21], respectively.


3.2. Artificial Neural Network Models 

Limited water quality data and the high cost of water quality monitoring often pose serious
problems for process-based modeling approaches to time series forecasting. ANNs provide a
reasonable alternative because they are computationally very fast and require far fewer input
parameters and input conditions than deterministic models. On the other hand, to perform well they
require a large pool of representative data sets for training, together with appropriate learning
algorithms.

ANNs have been tested for their usefulness in water quality prediction, and SVMs have also
demonstrated good results for the same task. A comparative study of ANN and SVM in [22], in which
the authors predicted the water quality of rivers by estimating the observed total nitrogen and
total phosphorus, showed superior results for the latter technique. In another study, Liao et al.
[23] assessed water quality using an SVM and a genetic algorithm, which delivered acceptable
results and was efficient enough for water quality classification. Chu et al. [24] proposed a case
study in which a Hopfield neural network was combined with factor analysis (FA) to form a Factor
Analysis-Hopfield Neural Network (FAHNN) that identifies the assessment factors for water quality
measurement. It provided more reliable judgments and more valuable information than comparable
techniques.

The results produced by a model also depend on the number of data sets, the training approach, and
the learning techniques and tools used, which together have a direct impact on the quality of the
results. ANNs can accurately approximate complicated non-linear input-output relationships. An ANN
first requires training, or calibration, on a large amount of statistical data. After training, it
is tested, or verified, on inputs whose outputs are already known. ANN techniques are flexible
enough to accommodate additional constraints that may arise in an application. Moreover, an ANN
model can reveal hidden relationships in chronological data, thus aiding the estimation of nutrient
pollutants. Other prediction-based applications of neural networks include forest cover analysis,
land use modeling, natural resource estimation, and analysis of areas affected by natural
calamities (floods, landslides, cyclones, etc.), all of which have a direct impact on natural
resource management and planning systems.
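
The calibrate-then-verify cycle described above can be sketched with a very small network. The
example below is a minimal illustration, not the architecture of any cited study: it assumes one
input, one hidden layer of four tanh units, a linear output, full-batch gradient descent, and a
synthetic non-linear target (y = x squared) in place of real water quality data.

```python
import math
import random

def init_params(hidden, seed=0):
    rng = random.Random(seed)                      # fixed seed for repeatability
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return [w1, [0.0] * hidden, w2, 0.0]           # w1, b1, w2, b2

def predict(params, x):
    w1, b1, w2, b2 = params
    return sum(w * math.tanh(a * x + b) for a, b, w in zip(w1, b1, w2)) + b2

def mse(params, xs, ys):
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(params, xs, ys, lr=0.05, epochs=2000):
    """Back propagation with full-batch gradient descent on half the squared error."""
    w1, b1, w2, b2 = [list(p) if isinstance(p, list) else p for p in params]
    n, hidden = len(xs), len(w1)
    for _ in range(epochs):
        gw1, gb1, gw2, gb2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            gb2 += err
            for j in range(hidden):
                back = err * w2[j] * (1.0 - h[j] ** 2)   # error propagated to hidden unit j
                gw2[j] += err * h[j]
                gw1[j] += back * x
                gb1[j] += back
        for j in range(hidden):
            w1[j] -= lr * gw1[j] / n
            b1[j] -= lr * gb1[j] / n
            w2[j] -= lr * gw2[j] / n
        b2 -= lr * gb2 / n
    return [w1, b1, w2, b2]

xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]                           # synthetic non-linear relationship
train_x, train_y = xs[0::2], ys[0::2]              # calibration set
test_x, test_y = xs[1::2], ys[1::2]                # held-out verification set
start = init_params(hidden=4)
fitted = train(start, train_x, train_y)
print(mse(start, train_x, train_y), mse(fitted, train_x, train_y), mse(fitted, test_x, test_y))
```

The error on the held-out set, which the network never saw during training, is the figure that
corresponds to the verification step described in the text.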

Among the first applications of neural network modeling to satellite sensor imagery (Landsat TM)
were those of Baruha et al. [15] and Panda et al. [25]. Their research produced promising results,
with an effective and simple implementation for estimating lake water quality, mainly the
concentrations of chlorophyll and sediments. The former used the most common model, the
back-propagation neural network, whereas the latter implemented a radial basis function neural
(RBFN) network. Both were found to perform better than traditional regression analysis and were
useful in capturing the basic characteristic model of water bodies of various sizes. By analyzing
the optical distinctiveness of selected bands from LandSat 5 TM and LandSat 7 ETM+ imagery,
Canziani et al. [17] presented a methodology for inferring the trophic state index of lakes. An ANN
model with a multilayer perceptron and the back-propagation training algorithm was implemented to
determine chlorophyll-a and total suspended solids concentrations, and proved suitable for
understanding the complex dynamic behavior of water bodies.
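
For readers unfamiliar with the trophic state index, the sketch below computes it from Secchi
depth, chlorophyll-a, and total phosphorus. It assumes the index in question is Carlson's (1977)
trophic state index, which the source does not state explicitly, and the class boundaries are the
commonly quoted ones, treated here as adjustable assumptions.

```python
import math

def tsi_secchi(depth_m):
    """Carlson TSI from Secchi disk depth in metres."""
    return 60.0 - 14.41 * math.log(depth_m)

def tsi_chl(chl_ug_per_l):
    """Carlson TSI from chlorophyll-a in micrograms per litre."""
    return 9.81 * math.log(chl_ug_per_l) + 30.6

def tsi_tp(tp_ug_per_l):
    """Carlson TSI from total phosphorus in micrograms per litre."""
    return 14.42 * math.log(tp_ug_per_l) + 4.15

def trophic_class(tsi):
    """Commonly quoted class boundaries (assumption; local schemes may differ)."""
    if tsi < 40:
        return "oligotrophic"
    if tsi < 50:
        return "mesotrophic"
    if tsi < 70:
        return "eutrophic"
    return "hypereutrophic"

# Hypothetical lake with a Secchi depth of 0.3 m
index = tsi_secchi(0.3)
print(round(index, 1), trophic_class(index))
```

In a remote sensing workflow, the chlorophyll-a or Secchi depth values fed to these functions
would be the ones estimated by the neural network from the image bands.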

The most recent work, by Mohamed et al. [19], was inspired by Moses et al. [26] and Gholamalifard
et al. [27], who applied Multilayer Perceptron (MLP) neural network models to IRS P6 LISS III and
Landsat satellite images, respectively. It addresses the detection of lake bathymetry by applying
artificial neural network modeling to the reflectance of the green band, the red band, both
together, and a four-band combination of a SPOT image. The ANN was compared with a polynomial
correlation algorithm using green-band reflectance and a Generalized Linear Model (GLM) using
green- and red-band reflectance. The study demonstrates that ANNs give more accurate results, in
terms of the lowest root mean square error, than the other conventional methods for bathymetric
applications, and that the ANN using all bands outperforms those using a single band.
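
Model comparisons of this kind rest on a few standard error measures. The sketch below computes
the root mean square error together with the mean absolute error and the Pearson correlation
coefficient; the observed depths and the two sets of predictions are hypothetical numbers invented
for the example, not results from the cited work.

```python
import math

def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def pearson_r(observed, predicted):
    mo = sum(observed) / len(observed)
    mp = sum(predicted) / len(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical water depths (m) and the predictions of two competing models
depth_obs = [1.0, 2.0, 3.0, 4.0]
depth_ann = [1.1, 1.9, 3.2, 3.8]
depth_poly = [1.5, 1.5, 3.5, 3.5]
for name, pred in (("ANN", depth_ann), ("polynomial", depth_poly)):
    print(name, rmse(depth_obs, pred), mae(depth_obs, pred), pearson_r(depth_obs, pred))
```

The model with the lower RMSE on the same verification data is the one reported as more accurate
in comparisons such as the one above.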

To complete the picture, the power of artificial neural networks has been assessed for water
quality estimation in inland water bodies, where recognizing the algal bloom status of a water body
is also an essential step that facilitates the restoration process. With the same focus, Xue et al.
[20] applied the algal bloom index to in situ remote sensing reflectance and MODIS
Rayleigh-corrected reflectance, along with local wind speed. A simple statistical technique, the
Classification and Regression Tree (CART) model, was applied to these data to identify the vertical
distribution profile of phytoplankton biomass. The study concludes that a similar decision tree
approach could be used with other satellite imagery for monitoring and continuous assessment of the
level of eutrophication in other water bodies.
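
The core step of a regression tree is the search for the threshold that best separates the
response values. The sketch below shows that single step for one predictor, using the sum of
squared errors as the split criterion; the wind speeds and surface biomass values are hypothetical
and the code is not a reconstruction of the model built by Xue et al.

```python
def sum_squared_error(values):
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, error) of the single split on x that minimises squared error."""
    pairs = sorted(zip(xs, ys))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue                                   # no threshold between equal x values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sum_squared_error(left) + sum_squared_error(right)
        if error < best[1]:
            best = (threshold, error)
    return best

# Hypothetical data: wind speed (m/s) against a surface phytoplankton biomass proxy
wind = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
surface_biomass = [10.0, 10.0, 10.0, 2.0, 2.0, 2.0]
print(best_split(wind, surface_biomass))
```

A full CART model applies this search recursively to each resulting subset and across all
predictors, which produces the decision rules that the study reports.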

IRS P6 LISS III imagery was analyzed to predict the Secchi disk depth (SDD) of a lake using a
Multilinear Regression (MLR) model with all four bands (green, red, NIR, and MIR) as independent
variables and SDD as the dependent variable [21]. The computed results were found to be superior to
regression models based on spectral ratios and on individual bands, and the water was found to be
at the hypereutrophic level.
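
A multilinear regression of SDD on four band values can be sketched by solving the normal
equations directly. Everything numeric below is an assumption made for illustration: the band
values are in arbitrary units, and the SDD targets are generated from made-up coefficients so that
the fit can be checked against a known answer.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical band values per sampling point: green, red, NIR, MIR (arbitrary units)
bands = [[2.0, 1.0, 1.0, 1.0], [1.0, 2.0, 1.0, 1.0], [1.0, 1.0, 2.0, 1.0],
         [1.0, 1.0, 1.0, 2.0], [1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 1.0, 1.0]]
true_coef = [3.0, -0.5, -0.4, 0.2, 0.1]            # made-up coefficients for the demo
sdd = [true_coef[0] + sum(c * v for c, v in zip(true_coef[1:], row)) for row in bands]
coef = fit_mlr(bands, sdd)
print([round(c, 6) for c in coef])
```

With real imagery, `bands` would hold the reflectance extracted at the in situ sampling points and
`sdd` the Secchi depths measured there.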

In the work of Chen et al. [28], three models, a Radial Basis Function Neural Network (RBFN), an
Adaptive Network-based Fuzzy Inference System (ANFIS), and Multilinear Regression (MLR), were
developed and compared in terms of mean absolute error, root mean square error, and correlation
coefficient. The models were developed mainly to predict dissolved oxygen, total phosphorus,
chlorophyll-a, and Secchi disk depth in a reservoir. Among them, ANFIS proved the most suitable for
simulating the water quality parameters with reasonable accuracy. A short summary of the
assessment is given in Table 1.


Table 1. Artificial Neural Networks in the Assessment of Different Parameters of Water Bodies

| Citation | Input parameters (imagery/statistical data source) | ANN model and training approach | Major functionality | Key facets |
|---|---|---|---|---|
| Baruha et al., 2001 [15] | Landsat TM imagery | Back-propagation neural network with a single hidden layer | To estimate chlorophyll concentration and sediments of water bodies | Landsat TM preferred over MODIS and SeaWiFS sensors because of their low spatial resolution |
| Panda et al., 2004 [25] | Landsat TM imagery | Linear regression statistical model (LMR); radial basis function neural (RBFN) network | To determine the concentrations of chlorophyll-a (chl-a) and suspended matter (SM) | Cost-effective, quick, and feasible, with accuracy in predicted and actual results. Information from all bands preferred over a single band. RBFN more robust than LMR |
| Canziani et al., 2008 [17] | LandSat 5 TM and LandSat 7 ETM+ imagery | Multilayer perceptron with back-propagation learning algorithm | To determine chlorophyll-a and total suspended solids concentrations for understanding the complex dynamic behavior of water bodies | Remote sensor data processed by ANN are useful for monitoring the transformations in shallow lakes |
| Gholamalifard et al., 2013 [27] | Landsat 5 TM | Multilayer perceptron with back-propagation learning algorithm | To extract the bathymetry information of the southeastern Caspian Sea | ANN estimated depth with good accuracy even with relatively few in situ data sets and fairly poor sensor imagery |
| Moses et al., 2013 [26] | IRS P6 LISS III imagery | Three-layered feed-forward neural network with back-propagation training algorithm | To estimate lake bathymetry and Secchi Disk Transparency (SDT) | All four bands used; the system gives improved prediction accuracy |
| Xue et al., 2015 [20] | MODIS imagery | Classification and Regression Tree (CART) model; statistical techniques | To identify the vertical distribution profile of phytoplankton based on the algal bloom index of Lake Chaohu | The same approach with other satellite data could be applicable for monitoring algal biomass in other similar hydrology |


4. FUTURE SCOPE AND CONCLUSIONS 
From our study we identified that researchers have used imagery data, mainly Landsat TM, to
estimate Secchi disk depth (SDD) and the trophic state index (TSI) for inland lakes in our region
[29], [30]. As a future research direction, we intend to develop an overall geocomputing system
that analyzes imagery data; extracts SDD, TSI, chlorophyll-a, dissolved oxygen, and other
parameters; trains a neural network on them to learn the basic characteristics of the selected
water body; and predicts the current and future eutrophication status using time series
prediction, so as to facilitate restoration.
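
One possible first step of such a system, turning a parameter time series into training pairs for
a neural network, is sketched below. The window length, the forecast horizon, and the monthly TSI
values are hypothetical choices for illustration, not a specification of the planned system.

```python
def make_windows(series, lag, horizon=1):
    """Turn a time series into (past values, future value) pairs for supervised learning."""
    samples = []
    for t in range(lag, len(series) - horizon + 1):
        samples.append((series[t - lag:t], series[t + horizon - 1]))
    return samples

# Hypothetical monthly trophic state index values for one lake
monthly_tsi = [48.0, 50.0, 53.0, 55.0, 58.0, 61.0]
pairs = make_windows(monthly_tsi, lag=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Each pair supplies the network with the last three observations as input and the next observation
as the value to predict; a larger `horizon` shifts the target further into the future.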

In this assessment, we reviewed the types of imagery data used by different systems, along with
the artificial neural network models and their learning techniques. The main objective was to
explore the power of various neural network models in geocomputation and to draw a road map for
more technologically advanced, cognizant systems that, if deployed, could readily predict the
future behavior of water bodies and ultimately forecast the quality of water in the system.